@danielle-pinto danielle-pinto commented Jan 30, 2026

Making a draft PR here. There are multiple ways to solve the problem, and I added a first approach. I'm thinking that the second would be a more statistical/simulation approach. Basically, based on the values of k, m, and n, we can make a vector containing all of the possible organisms (e.g. [HH, Hh, hh, HH, etc.]). Then, we can calculate the percentage of dominant individuals out of the total individuals.

Wanted to run this by you first and see if you had any suggestions on packages to use.
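
For reference, a minimal sketch of what the simulation approach described above might look like, using only the `Random` standard library. The names `make_population` and `simulate_dominant` are hypothetical, and the sketch assumes a single trait with alleles `H` (dominant) and `h` (recessive), two distinct parents drawn at random, and each parent passing on one allele with equal probability.

```julia
using Random

# Hypothetical helper: the population as a vector of genotype strings,
# with k homozygous dominant, m heterozygous, and n homozygous recessive.
function make_population(k::Integer, m::Integer, n::Integer)
    return vcat(fill("HH", k), fill("Hh", m), fill("hh", n))
end

# Estimate the probability that an offspring shows the dominant phenotype
# by simulating many random matings (assumes at least two organisms).
function simulate_dominant(k, m, n; trials=100_000, rng=Random.default_rng())
    population = make_population(k, m, n)
    total = length(population)
    dominant = 0
    for _ in 1:trials
        # Draw two distinct parents (sampling without replacement).
        i = rand(rng, 1:total)
        j = rand(rng, 1:(total - 1))
        j >= i && (j += 1)
        # Each parent passes on one of its two alleles at random.
        a = rand(rng, population[i])
        b = rand(rng, population[j])
        dominant += (a == 'H' || b == 'H')
    end
    return dominant / trials
end

estimate = simulate_dominant(2, 2, 2)
```

The result is an estimate that varies from run to run unless the `rng` is seeded, so it approaches the exact answer as `trials` grows but will not generally match it digit for digit.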

@github-actions

Once the build has completed, you can preview your PR at this URL: https://biojulia.dev/BiojuliaDocs/previews/PR16/

kescobo commented Feb 2, 2026

> Once the build has completed, you can preview your PR at this URL: https://biojulia.dev/BiojuliaDocs/previews/PR16/

Just noting that the comment is being made, but the link doesn't actually work.

Probably unrelated to the above: your pull request is, for some reason, requesting to merge into another branch rather than into main.

@kescobo kescobo left a comment

Another solution would be to use StatsBase.jl and do a weighted probability.

One other thing that would be nice to include here is a bit more didactic discussion about how we often write algorithms that are narrowly tailored, and then either repeat ourselves or add complexity as additional requirements get tacked on. E.g., your solution works for this specific problem, but we'd have to derive a new equation if the question were something like "What's the probability of a heterozygous offspring?" It also doesn't scale up if we add another trait, etc.

The nice thing about the StatsBase.jl solution, and even a simulation, is that they can be made generic and then used to ask more types of questions. I'm not necessarily demanding we add this to a first draft, but maybe open an issue as a potential enhancement.
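
To illustrate the "generic" point, here is a rough sketch in plain Julia. It is an exact enumeration, not the StatsBase.jl approach itself, and `offspring_probability` is a hypothetical name. It assumes one trait, genotypes written as two-character strings, and a population of at least two organisms. The question being asked is passed in as a predicate on the two inherited alleles, so the same function covers both "dominant phenotype" and "heterozygous offspring".

```julia
# Exact probability that an offspring's two alleles satisfy `predicate`,
# given a population as genotype => count pairs.
function offspring_probability(predicate, counts::AbstractDict)
    total = sum(values(counts))
    prob = 0.0
    for (g1, c1) in counts, (g2, c2) in counts
        # Probability of drawing g1 first and then g2, without replacement.
        remaining = g1 == g2 ? c2 - 1 : c2
        pair = (c1 / total) * (remaining / (total - 1))
        # Punnett square: each of the four allele combinations has weight 1/4.
        for a in g1, b in g2
            prob += pair / 4 * predicate(a, b)
        end
    end
    return prob
end

counts = Dict("HH" => 2, "Hh" => 2, "hh" => 2)

# Offspring carries at least one dominant allele.
p_dominant = offspring_probability((a, b) -> a == 'H' || b == 'H', counts)

# Offspring is heterozygous: no new equation needed, only a new predicate.
p_hetero = offspring_probability((a, b) -> a != b, counts)
```

Extending this to a second trait would still need a richer genotype representation, which is the kind of design discussion that could go in the follow-up issue.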


!!! warning "The Problem"

Probability is the mathematical study of randomly occurring phenomena. We will model such a phenomenon with a random variable, which is simply a variable that can take a number of different distinct outcomes depending on the result of an underlying random process.

Suggested change
Probability is the mathematical study of randomly occurring phenomena. We will model such a phenomenon with a random variable, which is simply a variable that can take a number of different distinct outcomes depending on the result of an underlying random process.
Probability is the mathematical study of randomly occurring phenomena.
We will model such a phenomenon with a random variable,
which is simply a variable that can take a number of different distinct outcomes
depending on the result of an underlying random process.

Semantic line breaks please - it makes editing and diffs much nicer.

One way to help remember is to turn off automatic line breaks in your text editor.


### Deriving an Algorithm

Using the information above, we can derive an algorithm using the variables k, m, and n that will calculate the probability of a progeny possessing a dominant allele. We could either calculate the probability of a progeny having a dominant allele, but in this case, it is easier to calculate the likelihood of a progeny having a recessive allele. This is a relatively rarer event, and the calculation will be straightforward. We just have to subtract this probability from 1 to get the overall likelihood of having a progeny with a dominant trait.

Suggested change
Using the information above, we can derive an algorithm using the variables k, m, and n that will calculate the probability of a progeny possessing a dominant allele. We could either calculate the probability of a progeny having a dominant allele, but in this case, it is easier to calculate the likelihood of a progeny having a recessive allele. This is a relatively rarer event, and the calculation will be straightforward. We just have to subtract this probability from 1 to get the overall likelihood of having a progeny with a dominant trait.
Using the information above, we can derive an algorithm using the variables k, m, and n that will calculate the probability of a progeny possessing a dominant allele. We could either calculate the probability of a progeny having a dominant allele, but in this case, it is easier to calculate the likelihood of a progeny having only recessive alleles. This is a relatively rarer event, and the calculation will be straightforward. We just have to subtract this probability from 1 to get the overall likelihood of having a progeny with a dominant trait.

Or "having the recessive phenotype".
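
For context, the derivation this thread is discussing (probability of the recessive phenotype, subtracted from 1) could be sketched as follows. This is an illustration under the usual assumptions, not necessarily the code in the PR: k homozygous dominant, m heterozygous, and n homozygous recessive organisms, with two distinct parents chosen at random. The name `dominant_probability` is hypothetical.

```julia
# Probability that an offspring shows the dominant phenotype,
# computed as 1 - P(offspring has only recessive alleles).
function dominant_probability(k::Integer, m::Integer, n::Integer)
    total = k + m + n
    pairs = total * (total - 1)  # number of ordered parent pairs
    # Ordered pairs weighted by the chance of a recessive offspring:
    #   hh x hh -> always:       n * (n - 1)
    #   Hh x hh (either order):  2 * m * n * 1/2 = m * n
    #   Hh x Hh -> 1/4:          m * (m - 1) / 4
    recessive = (n * (n - 1) + m * n + m * (m - 1) / 4) / pairs
    return 1 - recessive
end

dominant_probability(2, 2, 2)  # ≈ 0.78333 (exactly 47/60)
```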

kescobo commented Feb 2, 2026

I like the idea of a simulation, though it will generally not give a precisely correct answer for Rosalind. I think that's fine if that's explained.
